Physics informed contour selection for rapid image segmentation

Effective training of deep image segmentation models is challenging due to the need for abundant, high-quality annotations. To facilitate image annotation, we introduce Physics Informed Contour Selection (PICS)—an interpretable, physics-informed algorithm for rapid image segmentation without relying on labeled data. PICS draws inspiration from physics-informed neural networks (PINNs) and an active contour model called snake. It is fast and computationally lightweight because it employs cubic splines instead of a deep neural network as a basis function. Its training parameters are physically interpretable because they directly represent control knots of the segmentation curve. Traditional snakes involve minimization of the edge-based loss functionals by deriving the Euler–Lagrange equation followed by its numerical solution. However, PICS directly minimizes the loss functional, bypassing the Euler Lagrange equations. It is the first snake variant to minimize a region-based loss function instead of traditional edge-based loss functions. PICS uniquely models the three-dimensional (3D) segmentation process with an unsteady partial differential equation (PDE), which allows accelerated segmentation via transfer learning. To demonstrate its effectiveness, we apply PICS for 3D segmentation of the left ventricle on a publicly available cardiac dataset. We also demonstrate PICS’s capacity to encode the prior shape information as a loss term by proposing a new convexity-preserving loss term for left ventricle. Overall, PICS presents several novelties in network architecture, transfer learning, and physics-inspired losses for image segmentation, thereby showing promising outcomes and potential for further refinement.


Methods
PICS aims to achieve segmentation for a given 2D or 3D image.In scenarios where experts possess information about the shape of the target object, PICS leverages this information to enhance its performance.The flowchart of the PICS algorithm is depicted in Fig. 1.In this section, we will go into details of its individual components, i.e., (a) PICS hypothesis, (b) the loss function and optimization,(c) the prior shape-based loss term, and (d) the operation performance index (OPI)-a metric to monitor the optimization performance of PICS.

PICS hypothesis
We approximate the target solution, i.e., object boundary with a parametric spline.The expression of parametric spline ψ is given by where the local cubic spline � ψ i (s i < s < s i+1 ) is given by The algorithm takes as input an image (INPUT-1) and, optionally, prior shape knowledge (INPUT-2) of the target in certain cases.The user initiates the snake by hovering over the image and clicking on it once.The initial weights of the snake denote the control knots of the cubic spline, functioning as PICS's hypothesis.Subsequently, we compute losses based on the internal and external energy of the snake model, and optionally, a prior-shape-based loss.The total loss is minimized through gradient descent, leading to updates in the values of spline control knots.
In the given equation, s is a parameter that varies from 0 to 1. [u i , v i ] ′ denotes the spatial coordinates of the local spline ψ i .For given [u i , v i ] ′ , the coefficients of local splines, denoted by [a i , b i , . . ., h i ] ′ , are computed by satisfying the conditions of continuity, smoothness, and periodicity.Mathematically,

Remarks
Conceptually, PICS architecture can be viewed as a special case of Distributed PINNs 12 , where localized shallow PINNs are employed.In the later sections, we will note that 3D segmentation is also inspired by time-marching Distributed PINNs.However, there are distinctions between PINN and PICS which are outlined as follows: • Architecture: PINNs use a deep neural network with a large number of parameters to approximate the solution, whereas PICS employs cubic splines that can efficiently approximate any closed contour with only a few control knots.It is already reported by Dwivedi and Srinivasan 15 that the use of simplified architectures improves the speed of PINN.• Mathematical nature: PINN's hypothesis is nonlinear.In contrast, the hypothesis in PICS is linear.
• Interpretation of weights: Unlike in PINN, the spline coefficients are not directly considered trainable weights in PICS.In PICS, the weights are the sampling points or the control knots themselves, i.e., [u i , v i ] ′ .There are two main reasons for preferring control knots over spline coefficients: 1. Opting to directly choose spline coefficients results in a large number of weights, making it impossible for us to visualize and physically interpret loss gradients in a 2D plane.Conversely, if we designate the control knots as weights, any number of weights can be represented in a 2D plane, and the loss gradient then physically signifies the force acting on the control knots.2. Weight decay is often used for good generalization during the training of deep neural networks.
However, weight decay is purely a data-driven technique.In contrast, PICS weights are subject to physical constraints such as continuity, smoothness, and periodicity.Consequently, any erratic oscillation in one weight is counteracted by the collective influence of all the others.

Loss function and optimization
PICS combines snake's internal energy terms 10 with a Chan-Vese loss 16,17 term for external energy.Chan-Vese loss functional is less sensitive to noisy pixel-level annotations, which is beneficial when dealing with imperfect or noisy ground truth data.The formula for the loss function is given by where In the above expressions, I denotes the image, ( u, v) denotes spline knots, and N denotes the number of spline control knots.χ denotes a characteristic function or mask that is generated by repeated geometric queries, that is, given a single polygon through spline knots and a sequence of query points (grid points), find if the query point lies inside or outside the polygon using point in polygon algorithms 18 .µ in , µ out denote the average pixel value of the image within and outside the spline contour.N x , N y denote number of pixels in x and y direction respectively.With respect to Fig. 1, the external energy term of the loss is J ext = µJ cv , and the internal energy term is J int = αJ ψ s + βJ ψ ss .The hyperparameter γ aims to make the pixel intensities inside the contour more uniform.While doing the weight update by gradient descent, the derivative of loss with respect to weights is compulsory.For example, The expression for weight update using gradient descent is given by where w denotes weight or spline control knots, is the learning rate and ∂J ∂w is the loss gradient.

Remarks
• The loss gradient cannot be calculated directly or even by automatic differentiation 19 because there is no explicit differentiable function that maps control knots with mask.Therefore to calculate derivatives, we use central difference scheme.This is the advantage of minimizing the energy functional instead of using PINN-like PDE residual, as it relaxes the differentiability requirements.For faster convergence and adaptive learning rate, we have used Adam 20 optimizer for our numerical experiments.• In most deep neural networks, the weights are typically initialized randomly and do not have any physical significance.However, PICS is formulated in such a way that trainable weights are represented by the control knots of cubic splines.It gives it a clear physical meaning to weights that simplifies scaling and normalization steps.Similarly, the loss gradient in PICS can be physically interpreted as the force on the control knots pushing them towards the direction of gradient descent.

Prior shape-based loss term
In cases where domain experts possess information about the shape of the target object, PICS can leverage this valuable information.This is accomplished by introducing prior-shape based loss terms, allowing the algorithm to incorporate and benefit from the expert-provided shape information.This paper will use PICS to generate annotations for the left ventricle in the cardiac MRI scan images.
• Domain knowledge: A representative cardiac MRI scan shown in the left-hand side of Fig. 2 is composed of three main parts: left ventricle, right ventricle, and myocardium.In clinical applications of cardiac left ventricle (LV) segmentation, it is desirable to include the cavity, trabeculae, and papillary muscles, which collectively form a convex shape, as shown by the right-hand side of Fig. 2 where some reference annotations for left ventricle are depicted.• Why left ventricle?: Trabeculae and papillary muscles have similar intensities to the myocardium, which can cause segmentation algorithms to incorrectly classify them as myocardium.The problem here is to find a way to accommodate medical domain knowledge with a purely data or image-driven algorithm.• Proposed loss term: To address this challenge, Shi and Li 14 developed a method that preserves the convexity of the left ventricle by controlling the curvature in the level set framework.Similarly, in PICS, we propose a new loss term that preserves convexity in the snake framework.This loss term is expressed as follows: where κ denotes the curvature of contour, N denotes number of spline control knots and σ is a hyperparameter.This penalty term ensures that the shape of the predicted boundary remains convex-shaped.
Please note that such an information about the shape of the object is not always available.In those cases, as Fig. 1 shows, PICS works with just J int and J ext .

Operation performance index (OPI)
The region-based loss function is comprised of both shape regularization and external energy (or mean square error) terms.During optimization, our aim is to reduce the total loss.However, if the relative strengths of the shape regularization and MSE terms need to be appropriately balanced, the solution may get stuck in a local minimum. (8) ṽss ũs − ũss ṽs To prevent this issue, we introduce a new performance evaluation metric known as the Operation Performance Index (OPI).A value of one for OPI indicates that PICS is moving in the right direction, while a value below a predetermined threshold (e.g., 0.8) indicates that parameter adjustments are necessary.However, what constitutes the "right direction"?In the best-case scenario, both external and internal energies should drop as optimization proceeds.However, if that is not possible, the external energy should always drop whether shape regularization loss drops or not.This idea is mathematically contained in the following formula for OPI: where k , w, sign, and represent iteration number, iteration window size, signum function, and dot product, respectively.△J represents vector of difference in J.The exponential smoothing term ensures that the recent values are given more weightage.
OPI can also be used for hyperparameter tuning.For instance, the values of the hyperparameters α and β can be determined through trial and error.But, the third parameter, µ , which is initially set to 1e3, can be adjusted using OPI.When OPI falls below the threshold, the update rule for µ is given by However, we do not continue adjusting µ indefinitely.We stop adjusting µ once the order of J ext becomes more than 1e4 times that of J int to prevent the snake from becoming too loose.
When examining the total loss history alone, it is difficult to determine whether the optimization is progressing correctly.However, looking at the OPI trend, we can check if its value is very low or wildly fluctuating between 0 and 1.Based on this information, the hyperparameters may be adjusted.We will cover the application of this idea in the next section.

Rapid 3D segmentation with transfer learning
If F represents a nonlinear transformation that takes image I(x, y) and initial weights w as input and gives weight update as output.We can model the 3D segmentation process as follows: where the weights are initialized by a single mouse click of the user, i.e.,

It is mathematically equivalent to solving an unsteady PDE given by
This PDE approach is inspired by Distributed PINNs 12 , and the effect of transfer learning is "spatially-aware initialization" from second slice onwards.

Remarks
This section described the PICS methodology and provided a comparative analysis with PINN.Before proceeding to the next section, it's important to emphasize how this methodology effectively addresses certain limitations inherent in traditional snakes.
• Sensitivity to initialization: One of the primary challenges with snakes is their sensitivity to initialization.If the contour is initialized far from the actual boundaries, it might not converge due to the presence of local minima.In the case of PICS, a domain expert clicks on the image, ensuring that the initial snake is within the image and not too distant from the true boundary.• Sensitivity to noise: Chan-Vese loss functional is less sensitive to noisy pixel-level annotations, which is beneficial when dealing with imperfect or noisy ground truth data.• Complicated mathematical formulation: PICS directly works with loss functionals, eliminating the necessity of deriving Euler-Lagrange equations.The optimization process involves straightforward gradient descent, facilitating the minimization of all types of loss functionals.• Difficult to incorporate shape-priors: PICS can easily incorporate shape priors as loss terms.( 9)

Results
In this section, we demonstrate the effectiveness of PICS in 2D and 3D segmentation with and without prior shape information.In all the cases, Adam optimizer 20

Test cases
For testing PICS, we have considered the following cases: • 2D test cases : (TC-1) 2D segmentation of CT scan of enlarged ventricles of hydrocephalus patient (Case courtesy of Paul Simkin, Radiopaedia.org,rID: 30453).The source of data is: https:// radio paedia.org/ cases/ obstr uctive-hydro cepha lus, (TC-2A, TC-2B) 2D segmentation (without and with shape prior loss) of MRI scan with indistinct left ventricle from ACDC dataset 13 .• OPI test cases: (TC-3A, TC-3B, TC-3C, TC-4) 2D segmentation without and with adaptive hyperparameters of synthetic image of a cavity [21][22][23] .This is a standard test case where traditional snake models have been observed to be unsuccessful in navigating through concavities, (TC-4) 2D segmentation of Texas state from the map of the United States of America (USA).• 3D test cases: (TC-5) 3D segmentation of MRI scans of cardiac data of 100 patients from the ACDC dataset 13 in the ED, i.e., End-Diastolic phase.Source:https:// www.creat is.insa-lyon.fr/ Chall enge/ acdc/ index.html

Evaluation metrics
• Qualitative evaluation: Except for the ACDC dataset images, the objects in both 2D and OPI test cases are simple and can be assessed visually without the need of any expert labels.• Quantitative evaluation: The last test, which uses the ACDC dataset, requires expert interpretation and therefore is evaluated by comparing the results with annotations provided by medical professionals.The Dice score is used as the evaluation metric to compare the results.The formula for dice score is where A and B are the two sets being compared.The Dice value ranges from 0 to 1, where 0 indicates no common elements between the sets and 1 indicates identical sets.

Hyperparameter selection
The loss function in PICS comprises internal and external energy terms.The hyperparameters govern the relative significance of these terms, and their absolute values do not hold any particular significance.Consequently, there is no unique search space for hyperparameters.In this paper, the process of hyperparameter selection has not been automated and is definitely a topic of future work.Nevertheless, the adopted trial-and-error approach is as follows: • Begin by focusing on internal energy terms.Choose some values for (α, β) such that the orders of magnitude of αψ 2 s and βψ 2 ss are comparable for simple shapes like circle and rectangle.For this, typically the order of magnitude of β is one or two less than order of magnitude of α .For example if α = 0.1 , then β = 0.01 or 0.001.
• Moving on to the external energy term, the value of µ is fine-tuned to ensure that the order of magnitude of the Chan-Vese loss is not lower than that of the internal energy term.In any case, the decrease in internal energy should not be favored over the increase in external energy.OPI-based hyperparameter tuning essentially accomplishes this.The coefficient of the image gradient, denoted as γ , is typically set to zero but may be adjusted only in special cases when the object boundary is significantly blurred.• Lastly, when there is prior shape information available for the Left Ventricle, the value of σ needs to be set.In typical scenarios, its value is kept low.However, for the indistinct muscles category, where domain expertise should take precedence over data, the value of σ is increased by a factor of 10.This adjustment ensures that prior shape-based loss dominates over the Chan-Vese loss terms.

Control knots initialization
The initial state of the snake is set as a circle, typically with a relatively small radius compared to the object intended for segmentation.The user determines the circle's center position by hovering over the image and clicking the mouse.The control points are evenly spaced along the circle's circumference.

2D cases (TC-1, TC-2)
The PICS settings for all the 2D cases are summarized in Table 1.The main findings from the 2D cases are as follows: • TC-1: In the first case, we consider a CT scan of the enlarged ventricles of a hydrocephalus patient.Figure 3 shows the segmentation results.The CT scan shows two enlarged ventricles.The left-hand side of Fig. 3 shows   www.nature.com/scientificreports/auto_ examp les/ segme ntati on/ plot_ chan_ vese.html) and snake (https:// scikit-image.org/ docs/ stable/ auto_ examp les/ edges/ plot_ active_ conto urs.html) models in Fig. 6.In the level set formulation, the Chan-Vese model does not distinguish between the bright intensities of the left and right ventricle.On the other hand, the traditional snake model, even when initialized very close to the true mask, adheres to locations with sharp pixel gradients.

OPI cases (TC-3A, 3B, 3C and TC-4)
• TC-3A and TC-3B: cavity case without adaptive hyperparameters.Figure 7 displays two instances of bad minima.The first example shows a shrunken snake due to a high value of the bending coefficient.In contrast, the second shows the snake getting trapped in a local minimum because the increase in loss value caused by extension is greater than the drop in loss value due to the Chan-Vese loss.The loss history and OPI trend for each case are shown in Figs. 8 and 9 respectively.When examining the total loss history alone, it is difficult to determine whether the optimization is progressing correctly.However, looking at the OPI trend, we can see that its value is very low (zero) for the first case and fluctuates wildly between 0 and 1 for the second.Therefore, we can rely on the OPI to determine that PICS is stuck in a local minimum and requires adjustments to its hyperparameters to overcome this issue.• TC-3C: cavity case with adaptive hyperparameters.Figure 10 demonstrates that PICS is capable of accurately capturing concave regions.Figures 11 and 12 provide additional information on this particular case, including the OPI score, loss history, and adaptive tuning of hyperparameters.These figures show that hyperparameters are adjusted as needed and a high value of OPI ensures that optimization is progressing in correct direction.No adjustments like gradient vector field 21 or balloon forces 24 are required in PICS framework.
• TC-4: USA map with adaptive hyperparameters.In the Figs. 13 and 14, we segment Texas state borders from the USA map.These figures show that hyperparameters are adjusted as needed.It should be noted that the PICS should be initialized near New Mexico region to provide large number of control knots for capturing sharp boundaries.• 3D PICS procedure: Figure 15 demonstrates the 3D segmentation process of the left ventricle in the PICS framework.The initialization process begins with a single click within the left ventricle on the first MRI image.Subsequently, the optimal PIC weights obtained from the previous image are transferred as the initial condition for the next image.The transfer learning accelerates the convergence of the PIC to its optimal value.This iterative process is continued for all the remaining images.• Speed up: Figure 16 illustrates an example of the speed-up in convergence due to this transfer learning process.Usually, the initial segmentation is finished in less than 30 s, while all the remaining segmentations also typically conclude within 30 s.Therefore, the full 3D segmentation takes about a minute to complete.• Hyperparameters selection: The images in the ACDC dataset contain normal cases, indistinct muscle cases and cases with very thin myocardium (see Fig. 17).PICS uses different hyperparameter settings for different image categories in the ACDC dataset.For the normal case, the hyperparameters roughly have values of ( α,β ,µ,γ,σ)=(1e−1,1e−2,1e4,1e−5,1e7).For the indistinct muscles category, the value of σ is increased by a factor of 10 while keeping all other hyperparameters fixed.Similarly, for the last class with very thin myocardium, the value of γ is increased by a factor of 100-200 while keeping all other hyperparameters fixed.These hyperparameter settings are based on the observations from the ACDC dataset and have been found to provide good performance in their respective image categories.• Performance on ACDC dataset: Figure 18 shows the performance of PICS on all the hundred patients' data, with an average dice score of 0.933.The number of trainable parameters for all the cases is the same, which is 20 parameters.

• Comparison with winners of ACDC challenge:
The best dice score(https:// www.creat is.insa-lyon.fr/ Chall enge/ acdc/ resul ts.html) is 0.96 while ours is a close 0.93.The first two winners of the challenge directly used the   To summarize, PICS operates without the need for labels and utilizes only 20 parameters, in contrast to architectures like U-net that rely on millions of parameters, necessitating a substantial amount of high-quality data for effective training.It's important to note that PICS doesn't directly generate true masks; however, it can be utilized by experts to quickly produce initial masks.For example, in left ventricle segmentation, an expert can click once on the left ventricle, and PICS will provide a high-quality initial mask.The expert can then refine the positions of control knots (weights) and redraw a spline through them, reducing the labeling effort.Therefore, PICS demonstrates promise within the scope of our ongoing study, serving as an initial proof of concept.

Discussion
In this paper, we introduced PICS-an interpretable, physics-informed algorithm for rapid image segmentation in the absence of labeled data.PICS is a novel algorithm that combines the traditional active contour model called snake with the physics-informed neural networks (PINNs).PICS inherits the unique qualities of its parent algorithms (snakes and PINNs), making it intuitive, mesh-free, and respecting the inherent physics of the      traditional energy-based loss functions.The use of cubic splines over deep neural network as basis function and the treatment of spline control knots as design variables further increase its interpretability.We demonstrate that PICS is the first work to minimize the Chan-Vese loss in the snake framework and allows for easy integration of domain expertise via prior shape-based loss terms.PICS uniquely connects 3D segmentation with the solution of an unsteady PDE.By using this connection, as demonstrated by the outcomes on the ACDC dataset, PICS effectively utilizes transfer learning, leading to rapid and efficient segmentation.
However, PICS also faces challenges in inverse parameter estimation and topology changes during 3D segmentation which are given below: 1. Inverse parameter estimation.It may be argued that because PINNs have been successful in both forward and inverse problems and PICS is derived from PINN, it should be possible to estimate hyperparameters by minimizing a loss function with hyperparameters treated as trainable weights.However, Zapf et al. 26 explain that this may not be effective.Even when the predicted segmentation is correct, all three terms of the PICS loss function, including shape regularization, region-based loss, and shape-based loss, may not be equal to zero.Depending on the image complexity, the ratio of region-based loss to other losses may be the most trustworthy at times, while the ratio of shape-based loss to others may be most trustworthy at other times.Therefore, in this work, hyperparameters were chosen through trial and error.For simpler images, like those in Fig. 10, we can select the ratio of region-based loss to other losses as the most trustworthy and automate the hyperparameter selection process.Readers would appreciate that the proposed OPI effectively implements the suggestion by Zapf et al. 26 .

Figure 3 .
Figure 3. Segmentation of enlarged ventricles of a hydrocephalus patient.Left: Initial weights, Right: Optimized weights.

Figure 4 .
Figure 4. Low Dice score: Segmentation of the left ventricle without convexity preserving shape prior.

Figure 5 .
Figure 5. High Dice score: Effect of inclusion of convexity preserving shape prior in segmentation of the left ventricle.

Figure 6 .
Figure 6.Segmentation with the traditional Chan-Vese and snake models.Chan-Vese model does not distinguish between the bright intensities of the left and right ventricle.Snake model sticks to the region with sharp gradients.

Figure 7 .
Figure 7. Two examples of bad minima due to loss function.In both the cases, the order of magnitude of drop in internal energy is higher (or comparable) than that of drop in external energy.

Figure 8 .
Figure 8. OPI trend (low value throughout) for LHS case of Fig. 7.The snake shrinks at all the steps and will ultimately shrink to a point.Dotted red line denotes the threshold.

Figure 9 .
Figure 9. OPI trend (oscillating between 0 and 1) for RHS case of Fig.7.The snake gets stuck in a local minimum after about 100 iterations.Dotted red line denotes the threshold.

Figure 10 .
Figure 10.Performance of PICS with adaptive hyperparameters on a u-shaped cavity.

Figure 11 .
Figure 11.OPI trend of PICS with adaptive hyperparameters for cavity test case.Dotted red line denotes the threshold.

Figure 12 .
Figure 12.PICS hyperparameters tuning for cavity test case.

Figure 13 .
Figure 13.Segmentation of Texas state from the USA map.Left: Initial weights, Right: Optimized weights.

Figure 14 .
Figure 14.PICS hyperparameters tuning for the USA map case.

Figure 15 .
Figure 15.3D segmentation of left ventricle with PICS using transfer learning (for ED phase).

Figure 16 .
Figure 16.Faster convergence due to transfer learning.

Figure 18 .
Figure 18.Performance of PICS in annotating the whole dataset consisting of 100 patients.Average dice score = 0.933.

Figure 19 .
Figure 19.An example of topology change: 3D segmentation of enlarged ventricles of a hydrocephalus patient.
is used.All the experiments are conducted in Matlab R2022b environment running in a 12th Gen Intel(R) Core(TM) i7-12700H, 2.30 GHz CPU and 16GB RAM Asus laptop.

Table 1 .
Experiment settings for 2D and OPI test cases.Columns 2 to 6 have values of loss coefficients.N w in column 7 denotes number of PICS weights.Last column has convergence times.TC-4 takes longer time than others because TC-4 image is of size 256 × 256 , and all others are 128 × 128.